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Image Coding Using Wavelet Transform 
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Abstract— Image compression is now essential for applica- 
tions such as transmission and storage in data bases. This paper 
proposes a new scheme for image compression taking into ac- 
count psychovisual features both in the space and frequency 
domains; this new method involves two steps. First, we use a 
wavelet transform in order to obtain a set of biorthogonal sub- 
classes of images; the original image is decomposed at different 
scales using a pyramidal algorithm architecture. The decom- 
position is along the vertical and horizontal directions and 
maintains constant the number of pixels required to describe 
the image. Second, according to Shannon's rate distortion the- 
ory, the wavelet coefficients are vector quantized using a multi- 
resolution codebook. Furthermore, to encode the wavelet coef- 
ficients, we propose a noise shaping bit allocation procedure 
which assumes that details at high resolution are less visible to 
the human eye. Finally, in order to allow the receiver to rec- 
ognize a picture as quickly as possible at minimum cost, we 
present a progressive transmission scheme. It is shown that the 
wavelet transform is particularly well adapted to progressive 
transmission. 

Keywords— Wavelet, biorthogonal wavelet, multtscale py- 
ramidal algorithm, vector quantization, noise shaping, pro- 
gressive transmission. 



I. Introduction 

IN many different fields, digitized images are replacing 
conventional analog images as photograph or x-rays. 
The volume of data required to describe such images 
greatly slow transmission and makes storage prohibitively 
costly. The information contained in the images must, 
therefore, be compressed by extracting only the visible 
elements, which are then encoded. The quantity of data 
involved is thus reduced substantially. 

A fundamental goal of data compression is to reduce 
the bit rate for transmission or storage while maintaining 
an acceptable fidelity or image quality. Compression can 
be achieved by transforming the data, projecting it on a 
basis of functions, and then encoding this transform. Be- 
cause of the nature of the image signal and the mecha- 
nisms of human vision, the transform used must accept 
nonstationarity and be well localized in both the space and 
frequency domains. To avoid redundancy, which hinders 
compression, the transform must be at least biorthogonal 
and lastly, in order to save CPU time, the corresponding 
algorithm must be fast. The two-dimensional wavelet 
transform defined by Meyer and Lemarie [31], [24], [25], 
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together with its implementation as described by Mallat 
[27], satisfies each of these conditions. 

The compression method we have developed associates 
a wavelet transform and a vector quantization coding 
scheme. The wavelet coefficients are coded considering a 
noise shaping bit allocation procedure. This technique ex- 
ploits the psychovisual as well as statistical redundancies 
in the image data, enabling bit rate reduction. 

Section II describes the wavelet transforms used in this 
paper. After a quick review of wavelets in general, we 
explain in more detail the properties and construction of 
regular biorthogonal wavelet bases. We then extend this 
one-dimensional construction to a two-dimensional 
scheme with separable filters. The new coding scheme is 
next presented in Section III. We focus particularly in this 
section on the statistical properties of wavelet coeffi- 
cients, on the asymptotic coding gain that can be achieved 
using vector quantization in the subimages, and on the 
optimal allocation across the subimages. Experimental re- 
sults are given in Section IV for images taken within and 
outside of the training set. 

II. Wavelets 
A. A Short Review of Wavelet Analysis 

Wavelets are functions generated from one single func- 
tion ^ by dilations and translations 

rV) = | 0 |- ,/ ^(^)- 

(For this introduction we assume / is a one-dimen- 
sional variable). The mother wavelet \p has to satisfy 
{ dx \j/(x) = 0, which implies at least some oscillations. 
(Technically speaking, the condition on \p should be 
{ dw \ V(w)\ 2 < oq, where V is the Fourier trans- 

form of \[/\ if ^(/) decays faster than | / 1 " 1 for t -» oo, then 
this condition is equivalent to the one above). The defi- 
nition of wavelets as dilates of one function means that 
high frequency wavelets correspond to a < 1 or narrow 
width, while low frequency wavelets have a > 1 or wider 
width. 

The basic idea of the wavelet transform is to represent 
any arbitrary function/ as a superposition of wavelets. 
Any such superposition decomposes / into different scale 
levels, where each level is then further decomposed with 
a resolution adapted to the level. One way to achieve such 
a decomposition writes / as an integral over a and b of 
\l/ a ' h with appropriate weighting coefficients [22]. In prac- 
tice, one prefers to write /as a discrete superposition (sum 
rather than integral). Therefore, one introduces a discre- 
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tization, a = nfc = nb*<%., with m, n e Z, and a 0 > 1 , 
b Q > 0 fixed. The wavelet decomposition is then 

/= Ec.^/)^, - (i) 

with * w , n (f) = = flo" m/ V(flo" m ' " ^o). De- 

compositions of this type were studied in [14], [15]. For 
a 0 - 2 t b 0 = 1 there exist very special choices of $ such 
that the $ m%n constitute an orthonormal basis, so that 

in this case. Different bases of this nature were con- 
structed by Stromberg [36], Meyer [31], Lemarie' [24], 
Battle [7], and Daubechies [16]. All these examples cor- 
respond to a multiresolution.analysis, a mathematical tool 
invented by Mallat [27], which is particularly well adapted 
to the use.of wavelet bases in image analysis, "and which 
gives rise to a fast computation algorithm. 

In a multiresolution analysis, one really has two func- 
tions: the mother wavelet ^ and a scaling Junction 0. One 
also introduces dilated arid translated versions of the scal- 
ing function, <t>„ tn (x) = 2" m/2 0(r w jc - n). For fixed ™, 
the <t> mt „ are orthonormal. We denote by V m the space, 
spanned by the these spaces V m describe successive 
approximation spaces, • • • V 2 C V, C V 0 .C V-i C V_ 2 
• v, each with resolution 2 m . For each m,.the „ span 
a space W m which is exactly the orthogonal complement 
in of V m \ the coefficients <^ m ,„,/>, therefore, de- 
scribe the information lost when going from an ap- 
proximation of /with resolution 2 m ~ 1 to the coarser ap- 
proximation with resolution T. All this is translated 
into the following algorithm for the computation of the 
c m , n (f) = <$ m ,n>f) (for more details, see [27]): 

fl ffl( „(A = .2A2«-*flm-!,*(/) (2) 

where g, = + I and h n = 2 I/2 J<*jc«(jc - h)4>(2x). 

In fact the a m n ( f) are coefficients characterizing the pro- 
jection of/ onto V m . If the function /is given in sampled 
form, then one can take these samples for the highest or- 
der resolution approximation coefficients a 0 „, and (2) de- 
scribes a subband coding algorithm on these sampled val- 
ues, with low : pass filter 7i and -high-pass filter g. Because 
of their association with orthonormal wavelet bases, these 
filters give exact reconstruction, i.e.: 

a m -ij(f) = Eih^^Jf) + g 2n -ic a jf)]. (3) 

Most of the orthonormal wavelet bases' have infinitely 
supported Vs corresponding to filters h and g with infi- 
nitely many taps. The construction in [16] gives \p with 
finite support, and therefore, corresponds to FIR filters. 
It follows that the orthonormal bases in [16] correspond 
to a subband coding scheme with exact reconstruction 
property, using the same FIR filters for reconstruction as 



for decomposition. Such filters are well known since the 
work of Smith and Barnwell [35] and of Vetterli [37]. The 
extra ingredient in the orthonormal wavelet decomposi- 
tion is that it writes the signal to be decomposed as a su- 
perposition of reasonably smooth elementary building 
blocks. The filters must satisfy the additional condition: 

n //(2-*£) 

' k =s 1 

. decay faster than C(l + | £ |)- e -°- 5 as | f | -+ oo, for some 
e > 0, where 

This extra regularity requirement is usually not satisfied 

by the exact reconstruction filters in the ASSP literature. 

"j' ■ . >' t ** > * • ?-*•* 

B. Applications of Wavelet Bases to Image Analysis 

1) Biorthogonal Wavelet Bases: Since images are 
mostly smooth (except for occasional edges) it seems ap- 
propriate that an exact reconstruction subband coding 
scheme for image analysis should correspond to an or- 
thonormal basis with a reasonably smooth mother wave- 
let. In order to have fast computation, the filters should 
be short (short filters lead to less smoothness, however, 
so they cannot be too short). On the other hand it is de- 
sirable that the FIR filters used be linear phase, since such 
filters can be easily cascaded in pyramidal filter structures 
without the need for phase compensation. Unfortunately, 
there are no nontrivial orthonormal linear phase FIR fil- 
ters with the exact reconstruction property [35], regard- 
less of any regularity considerations. The only symmetric 
exact reconstruction filters are those corresponding to the 
Haar basis, i.e., h 0 = Tij "= 2 1/2 and g 0 = -g } = 2 ,/2 , 
with all other h n , g n = 0. 

One can preserve linear phase (corresponding to sym- 
metry for the wavelet) by relaxing the orthonormality re- 
quirement, and using biorthogonal bases. It is then still 
possible to construct examples where the mother wavelets 
have arbitrarily high regularity. 

In such a scheme, we still decompose as in (2), but 
reconstruction becomes 

a m _ ,./(/) = X \^l<n(f) + g2n-lC m .n(f)] (4) 
n 

where the filters h, g may be different from h, g. In order 
to have exact reconstruction, we impose:. 

^ = (-ir/i_ n+1 

gn ~ (-1) + l n 

So far, we have not performed anything differently from 
the usual exact reconstruction subband coding schemes 
with synthesis filters different from the decomposition fil- 
ters. If the filters satisfy the additional condition that: 

II #(2-'|) and II tf(2 _ *£) (6a) 

*= I k = I 
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decay faster than C(l + |{ |)~' 0,5 as | £ | - o°, for some 
€ > 0, where 

/?(£) = 2- l/2 Zti n e- M H(Z) = 2- l/2 Zh n e' M 

n n 

(6b) 

then we can give the following interpretation to (2) and 
(4). Define functions <t> and by 

<f>(x) = Z h„<f>(2x - n) and 4>(x) = L h n $(2x - n). 

n n 

Their Fourier transforms are exactly the infinite products 
(6a), and they are, therefore, well-defined square inte- 
grate functions, compactly supported if the filters h and 
h are FIR. Define also 

*(x) = S g n <t>(2x - n) and fox) = Z gj(2x - n). 

n n 

Then, the a m „(/) and c m n {f) in (2) can be rewritten 



as: 



a m Jf) = <4v rt ,/> = 2- m l 1 j dx<t> m . n (x)f(x) 
c m Jf) = <^.„,/> = 2~ m/2 j dx^ n (x)f(x) 



and reconstruction is simply: 

/=2 < tfv„, />#,».«■ 



The filter bank structure with the associating wavelets 
and scaling functions is depicted on the following sub- 
band coding scheme (Fig. 1). 

If the infinite products in (6a) decay even faster than 
imposed above, then <f> and 4> and consequently ^ and \j/ 
will be reasonably smooth. Note that (7) is very similar 
to the orthonormal decomposition described in Section 
1I-A; the only difference is that the expansion of / with 
respect to the basis \p m r] _ uses coefficients computed via 
the dual basis \j/ m „ with \p different from This interpre- 
tation is not possible for all exact reconstruction subband 
coding schemes; in particular, convergence of the infinite 
products (6a) is only possible if 

Eft,, = V' 1 and ££„ = 2'/ 2 . 

n n 

Moreover, (7) can only hold if 

S(-l) ff /i n = 0 and It(-\) n K n = 0. 

n n 

Most exact reconstruction subband coding schemes do 
not satisfy these conditions. 

Biorthogonal bases of wavelets have recently been con- 
structed, with regularity simultaneously but indepen- 
dently, by Cohen, Daubechies and Feauveau [12] and by 
Herley and Vetterli [38]. Reference [12] contains a de- 
tailed mathematical study, with proofs that, under the 
conditions stated above, the wavelets do indeed constitute 
numerically stable bases (Riesz bases) and a discussion of 
necessary and sufficient conditions for regularity. In [18] 
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Fig. 1. Filter bank structure and the associating wavelets. 

Feauveau explores the construction from the point of view 
of multi resolution spaces rather than from the filters. Bas- 
ically one has two hierarchies of spaces in the bior- 
thogonal case, each corresponding to one pair of filters. 

It is shown in [12] that arbitrarily high regularity can 
be achieved by both \p and ^, provided one chooses suf- 
ficiently long filters. In particular, if the functions yp and 
\j/ are, respectively, (it - 1) and (k - 1) times continu- 
ously differentiate, then the trigonometric polynomials 
//({) and #(£) have to be divisible by (I + e~ A ) k and 
(1 + respectively, so that the length of the corre- 

sponding filters h y h has to exceed k % £. 

By (5), divisibility of /?(£) by (1 + e" Ji ) k means that ^ 
will have Jf consecutive moments zero: 



(7) j dx y^(.r) = 0, for / = 0, I, ■ ■ ■ , £ - 



I. 



For more details concerning this discussion, see [12]. 

It is well known (and it can easily be checked by using % 
Taylor expansions) that if \p has k moments zero, then the 
coefficients <^ m . n ,/> will represent functions /, which 
are £ times differentiate, with a high compression poten- 
tial (many coefficients will be negligibly small). 

Many examples of biorthogonal wavelet bases with rea- 
sonably regular \p and # can be constructed; for our ap- 
plications, the regularity of the elementary building blocks 
\p m „, which is linked to the number of zero moments of 
^, is more important than the regularity of the yjt m n or the 
numberof zero moments of Within the limits imposed 
by the support widths, we will, therefore, try to choose 
H as large as possible. 

In terms of trigonometric polynomials //(£) and //(£), 
the exact reconstruction requirement condition on h and 
R given in (5) reduces to (for symmetric filters) 

#(£)#(£) + tf(£ +*)#({ + x) = 1. (8) 
Together with divisibility of H and /?, respectively, by 
+ and (1 + <?~ ; *) f , this leads to (see [12]) 



(1 



H(£)R(l) = cos (J/2) 2 



/-i 

z 

P = o 



1 

P 



sin 



(*/2)* 



+ sin (£/2) 2 'fl(£)j 



(9) 



where /?({) is an odd polynomial in cos (£). and where 
21 = k + k (symmetry of h and h forces k + A" to be even). 



TABLE I 

Filter Coefficients for the Spline Filters with / = 3, k = 4, £ - 2 



n 


U 


± ! 


±2 


±3 . 


±4 


2-"\ 
2-"X 


45/64 
1/2 


19/64 
1/4 


-1/8 
0 


-3/64 
0' 


3/128 
0 



Many examples are possible. We have studied in par- 
ticular the following three examples, which belong to 
three different families. 

2) Spline Filters: One can choose, e.g., R a 0, with 
/?(§) = cos (f/2)V M/2 where k = 0 if £ is even, k = 1 
if k is odd. This corresponds to the filters called "spline 
filters" in [12] (because the corresponding function 0 is 
a fi-spline function) or "binomial filters** in [38] (because 
the ft are simply binomial coefficients). It then follows 
that: 



//($) - cos (£/2) 2/ - V** /2 



2 / sind/2) 2 * 



(10) 



We have looked at one example from this family; it 
corresponds to / = 3, k = 2. The coefficients /i„ and £„ 
are listed in Table I; the corresponding scaling functions 
and wavelets are plotted in Fig. 2. 

It is clear that the two filters in the first example have 
very uneven length. This is typical for all the examples in 
this family of * 4 spline filters . * * 

3) A Spline Variant with Less Dissimilar Lengths: This 
family still uses R m 0 in (9), but factorizes the right- 
hand side of (9), breaking up the polynomial of degree 
/. - 1 in sin (£/2) into a product of two polynomials in 
sin (£/2) with real coefficients, one to be allocated to H, 
the other to H, so as to make the lengths of A and h* as 
close as possible. 

The example presented here is the "smallest*' one in 
this family (shortest h and h ); it corresponds to / = 4 and 
k = 4. The filter coefficients are listed in Table .11; the 
corresponding scaling functions and wavelets are plotted 
in Fig. 3. , . 

Note that, unlike examples 1 and 3 where the 2~ l/2 /i„, 
2 " 1 /2 H n are rational , the entries in Table II are truncated 
decimal expansions of, irrational numbers. The functions 
<t> in examples 1 and 2 look very similar (compare Figs. 
2(a) and 3(a)); a more detailed analysis shows that the one 
in example 2 is more regular, however. Both correspond 
to 4 vanishing moments for 

4) Filters Close to Orthonormal Filters: Finally, there 
exist many examples for which R * 0. In particular there 
exists a special choice of R for which the two filters are 
very close to each other, and both very close to an or- 
thonormal wavelet filter. 

Surprisingly, for the first example of this series, one 
of the two filters is a Laplacian pyramid filter pro- 
posed in [9]. It corresponds to / - 2, * = 2 and 
/?(£) = 48 cos (£)/175. The filter coefficients are listed 
in Table III; the corresponding scaling functions and 




Fig. 2. Scaling functions 0, i and wavelets ^, J for example 1 (spline 
filters with I « 3; k » 4, £ = 2). (a) Scaling function <t>. (b) Scaling func-; 
tion 4>. (c) Wavelet ^. (d) Wavelet J. 

wavelets are plotted in Fig. 4. It is clear that the scaling 
functions <t> and £ are very similar, corresponding to very 
similar $ and Note that in this case, the filter coeffi- 
cients are again rational. 
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TABLE 11 

Filter Coefficients for the Spline Variant with Less Dissimilar 
Lengths, with I = 4 - A. if = 4 



n 0 


±1 


±2 


±3 


±4 


2- 1/2 h„ 0.602 949 


0.266 864 


-0.078 223 


-0.016 864 


0.026 749 


2-''% 0.557 543 


0.295 636 


-0.028 772 


-0.045 636 


0 





Fig. 3. Scaling functions <t>, 4> and wavelets & for example 2 (spline 
variant with less dissimilar lengths: / = 4 = it. k = 4). (a) Scaling function 
<t>. (b) Scaling function 4>. (c) Wavelet ^. (d) Wavelet 



TABLE III 

Filter Coefficients for Example 3. The Entries are Rational, and 
the Two Filters are very Close. The /i-Filter Coincides with a 
Laplacian Pyramid Filter Proposed in (9). In This Case 
/ = 2 = k. k~ = 2 



n 


0 


±1 


±2 


±3 


±4 


l~ u2 K 


0.6 
17/28 


0.25 
73/280 


-0.05 
-3/56 


0 

-3/280 


0 
0 



The two biorthogonal filters in this example are both 
close to an orthonormal wavelet filter of length 6 con- 
structed in [17], where it was called a ^coiflet/' Being 
an orthonormal wavelet filter, the coiflet is nonsymme- 
tric. The filters in this example are shorter than in exam- 
ples 1 and 2, but k is also smaller. The next example in 
this family corresponds to k = 4 (and / = 4); the filters h 
and h then have length 9 and 15; they are both close to a 
coiflet of length 12. 

5) Extension to the Two-Dimensional Case: There ex- 



ist various extensions of the one-dimensional wavelet 
transform to higher dimensions. We follow Mallat [27] 
and use a two-dimensional wavelet transform in which 
horizontal and vertical orientations are considered pref- 
erential. 

In two-dimensional wavelet analysis one introduces, like 
in the one-dimensional case, a scaling function 4>(x, y) 
such that: 



0(jc, y) = </>(*) 0(y) 



(ID 



where <f>(x) is a one-dimensional scaling function. 

Let \p(x) be the one-dimensional wavelet associated with 
the scaling function <t>(x). Then, the three two-dimen- 
sional wavelets are defined as: 



t H {x, y) = <t>(x)t(y) 
V{x, y) = *(x)<t>(y) 
4,°(x,y) = Mx)t(y). 



(12) 
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Fig. 4. Scaling functions i and wavelets J for example 3_(bior- 
thogonal filters close to an orthonormal wavelet filter, / = 2 = *, k « 2). 
(a) Scaling function 0. (b) Scaling function (c) Wavelet tf. (d) Wavelet 
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Fig. 5. One stage in a multiscale image decomposition. 



Fig. 5 represents one stage in a multiscale pyramidal 
decomposition of an image: wavelet coefficients of the 
image are computed, as in the one-dimensional case (Sec- 
tions II-A and II-B.l), using a subband coding algorithm. 
The filters h and g are one-dimensional filters. This de- 
composition provides subimages corresponding to differ- 
ent resolution levels and orientations (see Fig. 6). The 
reconstruction scheme of the image is presented Fig. 7. 

To compare the three different filters presented in this 
paper, we have decomposed the image Lena (Fig. 16) with 
each of these filters. The results are presented in Fig. 8. 



In Fig. 8(a) we can see the normalized detail subimages 
at different resolution levels m = 1, m = 2, and m = 3 
(wavelet coefficients) and in Fig. 8(b) the low resolution 
level subimages. 

III. Image Coding Application 
A. Statistical Properties of Wavelet Coefficients 

The performance of a coder used for a given resolution 
and direction can be determined by the statistics of the 
corresponding subimage, i.e., its probability density 
function (PDF). ' 
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Fig. 6. Image decomposition. 
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Fig. 7. One stage in a multiscale image reconstruction. 
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A typical PDF and different approximations are given 
in Fig. 9, where we plot the true PDF for resolution level 
m = 1 and direction d = vertical together with three model 
functions: a Gaussian, a Laplacian, and an intermediate 
function, the so-called generalized Gaussian [2]. 

This generalized Gaussian law is given explicitly by 

/Vrfto = a m . d exp (-| b md x\ r m ^ d ) 
with 




(13) 

where a md is the standard deviation of the subimage 
(m y d), and r( * ) is the usual Gamma function. 

The general formula (13) contains the other two ex- 
amples as particular cases: 



• r m.d - 2 leads to the well-known Gaussian PDF; 

• r m d = 1 leads to a Laplacian PDF. 

The variance of this approximation model is set equal 
to the variance of the corresponding subimage. Thus the 
parameter r m d is computed in order to match the real PDF 
using the well-known chi-squared test. In this case the 
optimum parameter was 0.7. Other experiments for other 
resolutions (except the lowest resolution) lead to very 
similar results. 

We can see in Fig. 9 that the real PDF (scale m - 1 
and vertical orientation) is closely approximated by a gen- 
eralized Gaussian law with parameter r, r = 0.7. 

B. Encoding of Wavelet Coefficients Using Vector 
Quantization 

Different techniques involving vector or scalar quanti- 
zation can be used to encode wavelet coefficients. 

According to Shannon's rate distortion theory, better 
results are always obtained when vectors rather than sea- 
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32*i£>ix H| • || 

.(b) ■ . ; " 

■• ■ ■ Fig. 8. Comparison among the different sub images, (a) Comparison among 

the normalized detail subimages. (b) Comparison among the low resolution 
level subimages. 



lars are encoded. Therefore, the present application uses 
vector quantization. 

L Principle of Vector Quantization: Developed re- 
cently by Gersho and Gray (1980) [20], El], vector quan- 
tization has proven to be a powerful tool for digital image 
compression [4], [29], [30], [32], [39]. The principle in- 
volves encoding a sequence of samples (vector) rather than 
encoding each sample individually. Encoding is per- 
formed by approximating the sequence to be coded by a 
vector belonging to a catalogue of shapes, usually known 
as a codebook. 

The codebook is created and optimized using the well- 
known Linde-Buzo-Gray (LBG) [26] classification al- 



gorithm with a mean squared error (MSE) criterion. This 
algorithm is designed to perform a classification based on 
a training set comprised of vectors belonging to different 
images; it converges iteratively toward a locally optimal 
codebook. 

Each of the vectors in the codebook is indexed. At the 
encoding stage, the index of the vector in the codebook 
most closely describing (in terms of MSE criterion) the 
sample set to be encoded is selected to represent this set. 
Of course, in order to reconstruct the sample set, the de- 
coder must have the same codebook as the coder. 

The encoding/decoding scheme depicted in Fig. 10 was 
proposed in [29] and [30] for orthonormal wavelets. 
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Fig. 10. Encoding/decoding scheme. 



2) Comparative Performances of Vector Quantization 
(VQ) and Scalar Quantization (SQ): According to [3], 
[13], [19], [43], [30] the asymptotic lower bound distor- 
tion gain obtained when VQ, rather than SQ, is applied 
to a subimage is expressed as: 



[\[p m M)l l/(c + l) dxJ + " 

^ r [« -i(c + ***.</) 

[J [p».-/W]*-' /(f + Urf, rfjrJ 

(14) 

for a subimage corresponding to resolution m and direc- 
tion d. p m%d (x) is the PDF of wavelet coefficients of the 
subimage with resolution m and direction d. 

Here, the MSE criterion is used as a distortion measure 
(c = 2). The values of A(k m dt 2) used are the upper 
bounds of the MSE computed and tabulated by Conway 
and Sloane for vector size k m%d [13]. This formula gives 
an indication of the minimum theoretical gain that can be 
obtained. 

However, this approximation is valid only for small 
quantization errors, i.e., for a high bit rate R m d . Thus the 
gain Gn%on\y gives here an asymptotic indication. 

In Fig. 1 1, the curves of Gl Q d are plotted as a function 
of the vector dimension k md for the Laplacian, Gaussian, 




B 2 4 6 B 10 12 14 It 



Vector dimension k 
Fig. 11. Asymptotic lower bound distortion gain = function (k„, d ). 

and generalized Gaussian approximation laws, and for a 
subimage at scale m = 1 and vertical orientation. Exper- 
imental results are closely matched by the theoretical re- 
sults for a generalized Gaussian law with r m d = 0.7 ex- 
cept for the lower subband. Therefore, all computations 
based on this approximation law show that, in each sub- 
band, VQ outperforms SQ (see Fig. 11). 

In summary VQ performs better for coding wavelet 
coefficients. 

3) Generation of a Multiresolution Codebook: The 
preceding paragraph explained why VQ outperforms other 
methods. Nonetheless, major problems are encountered 
in the VQ of images. 

• It is impossible to create a universal codebook (effi- 
cient for each image to be encoded). 



m ' d (c + \)A{k m ,, c) 
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• The LBG algorithm smooths high frequencies (loss 
of resolution). 

• There is a trade-off between low distortion and high 
compression rate (computational cost). 

• It is not easy to take into account the properties of 
the human visual system [28], [33]. 

The use of the wavelet transform (i.e., multiresolution) 
is one way of overcoming these different problems. 

The wavelet decomposition of an image enables the 
generation of a codebook containing two-dimensional 
vectors for each resolution level and preferential direc- 
tion (horizontal, vertical, and diagonal). Each of these 
subcodebooks (see Fig. 12) is generated using the LBG 
algorithm. 

• The training set is comprised of vectors belonging to 
different images corresponding to the resolution and ori- 
entation under consideration. 

• The initial codebook is generated by splitting the 
centroid (center of gravity) of this training set [21]. 

A multiresolution codebook can thus be obtained by as- 
sembling all of these resulting subcodebooks. Each sub- 
codebook has a low distortion level and contains few 
words, which clearly facilitates the search for the best 
coding vector; the coding computational load is reduced, 
because only the appropriate subcodebook (resolution di- 
rection) of the multiresolution codebook is checked for 
each input vector. In addition, the quality of the coded 
image is better. The multiresolution codebook is depicted 
in Fig. 12. 

Global codebook design has drawbacks in that it results 
in edge smoothing while the proposed method preserves 
edges. In fact, each subcodebook contains the shape of 
the wavelet coefficients which are most highly represen- 
tative in terms of the MSE criterion. 

Since the spatial and frequency aspects of the image are 
taken into account in the wavelet decomposition, the clas- 
sification and search during the encoding of a subimage 
vector can be achieved using a simple criterion such as 
least mean squares. This frees us from using distortion 
measurements such as weighted least mean squares or 
other measurements involving perceptual factors. These 
algorithms are indeed costly in computation time. 

C. Optimal Bit Allocation 

Multiresolution exploits the eye' s masking effects, and 
therefore, enables us to refine and select the type of cod- 
ing according to the resolution level and the contour ori- 
entation. Although a flat noise shape minimizes the MSE 
criterion, it is generally not optimal for a subjective qual- 
ity of image. To apply noise shaping across the VQ sub- 
images, we define a total weighted MSE distortion D*(R T ) 
,((17)) for a total bit rate R T ((18)). 

Let us define D md (R md ) the average distortion in the 
coding of the subimage (w, d) for R m d bits per pixel: 



D m . d (R m . d ) = E(\x - q(x)\ c ) = d(x, q(x)) 



c > 1 
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Fig. 12. Multiresolution codebook. 

for all coefficients x belonging to the subimage, q(x) being 
the quantization of x. 

Total distortion of the image for a total rate of R T bits 
per pixel is then given by: 

M .3 



L wi - I Z d ~\ 

(16) 

where corresponds to the distortion in the sub- 

image of lowest resolution M (texture subimage). 

The problem of finding an optimal bit assignment (in 
bits per pixel) for each subimage vector quantizer is then 
formulated as: 



1 



Min 

Rm.d 



M .3 

+ S S D m .AR m .j) x B m , d (17) 

m-\i. d=\ J 
I M 1 3 

subject to: R T = -^ R$ + E Z R m , d ,(18) 

L m-\ L d = I 

where l& corresponds to, the bit allocation, in bits per. 
pixel, of lowest resolution M subimage. 

Assignment of the weights is based on the fact that the 
human eye is not equally sensitive to signals at all spatial 
frequencies. On the basis of contrast sensitivity data col- 
lected by Campbell and Robson [10], and to obtain a con- 
trolled degree of noise shaping across the subimages, we 
consider a function B m d such that: 

(19) 

where a m d is the standard deviation corresponding to sub- 
image (m y ,d) and the values of y and P nKd are chosen 
experimentally in order to match human vision. 

Dj (R T ) is the total weighted encoding distortion func- 
tion, and M is the lowest resolution considered. 

The expression of D md (R m , d ) is given by [19] 



B nKd - y m log (o 2 lr) 



-f Rm.J 



x of m<J (p, r), c > \ 



with 



xjjlfc. 



(15) 



(20) 
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This minimization problem can be solved by using La- 
grangian multipliers. Using this technique, we must solve 
the following equation: 



(21) 



where X is a Lagrangian multiplier. 
Using (17) and (20), this equation becomes: 

d 



+ s 



-cR m .d 



(22) 

Taking the partial derivative with respect to R m d yields 
an expression for R m d in terms of X: 

In 2)a m d (p f c)B m ^ ^ 



C 



(c 



By substituting (23) into the constraint (18) of the min- 
imization problem we obtain an expression of the Lagran- 
gian multiplier X 

X = cln2[2-^- (, / 4W)/ ^> n n 



• [a m .Ap, c)S„, rf ]'/ 4 " 



(24) 



Finally, substituting X into (23) results in an expression 
of the optimal bit assignment R m ^ (in bits per pixel 
(bpp)) to the vector quantizer of subimage (m, d): 

4 M R T - Rft 1 , 

R m.d^ = 7M 1 + - '°g2 



4 « _ ! 



c r " 
-V m ' ■ 1 



This expression requires the knowledge of the sub- 
image's PDF's. 

The optimal distortion of the quantizer, D*^(R T ), is 
then computed by combining (25) and (17). We find: 



DU*r) = o3W> + ^r 1 2-*""' -^ ,/4 "- 1 

[M 3 ^ -i4**/4" - I 

II S O^.*)" 4 " 

(26) 

Finally, bit allocation which is a function of the image 
will be transmitted as side information requiring only a 
few bits. 



4" - 1 



IV. Experimental Results 

The images used are sampled 256 by 256 black and 
white images. The intensity of each pixel is coded on 256 
grey levels (8 bpp). 

The numerical evaluation of the coder's performance is 
achieved by computing the peak signal-to-noise ratio 
(PSNR) between the original image and the coded image. 

For each coded image, we can use a variable length 
code. We also give the corresponding (R T if an optimal 
entropy coding was performed, defined as follows. 

To the L codewords wy t j = 1 , 2, • • ■ , L of the vector 
quantizer corresponds to L regions (clusters) of IR A , 
= I, 2, • • • , L. The yth region is defined by 

<?j = {xeJR k /Q(x) = wj} 

and represents the subset of vectors of IR* which are well 
matched by the codeword Wj of the codebook. 

Thus for each resolution and direction, we can intro- 
duce the average information of the codebook, called the 
entropy measure: 

I L 

&m.d = -7— X 2 p(Wj) l0g 2 p(w,) bpp 
K m,d >-« 

where p(wj) is the probability of selecting the source vec- 
tor w jy belonging to the codebook at scale m and corre- 
sponding to the orientation d, during the coding of the 
image (m, d). 

Then, as in (18), CR r is the sum of the estimated entropy 
in each subimage as follows: 

1 M 1 3 

I m = \ l d = 1 

The vector quantizer used is a full search quantizer, 
i.e., during the coding, all of the vectors in the subcode- 
book corresponding to the resolution and direction to be 
encoded are searched. The selection criterion used is the 
MSE criterion. 



II [a m .Ap.c)B m , d .)^r' i "- i 

d — I 



(25) 



A. Comparison Between the Different Wavelets 

In the following, we present results obtained with the 
Lena image (image within the training set) for a real bit 
rate of 1 bpp and using the three different filters proposed 
in Section II-B. (Fig. 13 corresponds to filters 9-3 pre- 
sented in example 1, Fig. 14 corresponds to filters 9-7 
presented in example 2, and Fig. 15 corresponds to filters 
5-7 presented in example 3.) Here, the Lena image is 
taken as part of the training set in order to minimize the 
effects of quantization noise: this enables the influence of 
the filters to be taken into account. 

For a given set of filters, separate codebooks are trained 
for each resolution-orientation subimage, and bit alloca- 
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Fig. 13. Filters no. 1 9-3. PSNR = 31.82 dB. <ft T = 0.80 bpp. .^g- "«/ Filters no. 3, 5-7. PSNR = 31.46 dB, fl T = 0.80 bpp. 







.Fig. 14. Filters no. 2, 9-7, PSNR = 32.10 dB, (R r « 0.78 bpp.- 0 



Fig. 16. Original 256 by 256 Lena. 8 bpp. 



tion is carried out according to (25). For the Lena image, 
the bit assignment is represented in Fig. 17. Resolution 1 
(diagonal orientation) is discarded. Resojution 1 (hori- 
zontal and vertical orientations) and resolution 2 (diago- 
nal orientation) are coded using 256-vector codebooks 
(codeword size 4 by 4) resulting in a 0.5-b /pixel rate, 
while resolution 2 (horizontal and vertical orientations) is 
coded at a 2-b/pixel rate using 256-vector codebooks 



(codeword size 2 by 2). Finally, the lowest resolution is 
coded at 8 b/ pixel. 



B. Results as a Function of Regularity and Vanishing 
Moments 

in Section II-B, we mentioned our belief that both the 
regularity of the reconstruction wavelet $ and the number 
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Fig. 17. Subimages bit rate allocation: example of a bit allocation for a 
total bit rate of 1 bpp and for the 256 by 256 Lena image. 



of vanishing moments of the analyzing wavelet \j/ are im- 
portant in applications. To illustrate this we carried out 
the following experiments. For a given pair, h, fi, we ana- 
lyzed the same image twice: once as described above, and 
a second time after exchanging the roles of the filters h 
and fi. 

The filter pairs in example 2 both have the same number 
of vanishing moments, k = £ = 4. However, $ is con- 
siderably more regular than ^ (see Fig. 3). With this filter 
pair, our experiment on the Lena image led to a PSNR of 
32.10 dB in the first case, and to a PSNR of 31.51 dB if 
the roles of h and ft are inverted. The case where the re- 
construction wavelet has the highest regularity, therefore, 
performs best. 

In example 1 the functions \p and ^ have comparable 
regularity: both are continuous and neither has a contin- 
uous derivative. In fact ^ is a bit more regular than ^: # 
is differentiable almost everywhere, and is Holder contin- 
uous with exponent 1 , while ^ is Holder continuous with 
the exponent only at 0.83. On the other hand, ^ has 2 
vanishing moments, while $ has 4 (k = 4, ft = 2). The 
same experiment, again with the Lena image, now leads 
to a PSNR of 31.82 dB ifh y ft are taken as in Table I, and 
to a PSNR of 31.13 dB when the roles of h and fi are 
reversed. The situation where \p is most regular but ^ has 
fewer vanishing moments, therefore, performs better (gain 
of 0.69 dB) than the case where ^ has more vanishing 
moments but $ is less regular. This seems to suggest that 
the regularity of # has a larger effect than the number of 
vanishing moments of \p. However, in this example the 
difference in overall regularity, as measured by the dif- 
ferences between Holder exponents, is much smaller here 
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than in example 2 (0.17 as compared to 0.63 in example 
2), and it seems hard to explain how this smaller differ- 
ence in Holder exponent could account for a comparable 
gain in PSNR. Jn fact, the Holder exponent is not a very 
good measure for the regularity of # in this case: it is 
completely determined by the discontinuity of the deriv- 
ative of \j/ in only a few points, and it is insensitive to the 
fact that $ is infinitely differentiable in all other points. If 
this is taken into account, then \p looks much more regular 
than \p (the Holder exponent of which is determined by its 
behavior near a dense set of points), which might explain 
the gain in PSNR. 

We conclude from all this that: 1) for the same number 
of vanishing moments for \p, the scheme with most reg- 
ular \p is likely to perform best; and 2) increasing the reg- 
ularity of even at the expense of the number of vanish- 
ing moments for ^, may lead to better results. 

Based on theoretical arguments (Taylor expansions) and 
results from numerical analysis [8], we also expect: 3) for 
comparable regularity of the scheme with largest van- 
ishing moments for \p is likely to perform best. 

C. Comparison with Other Coders 

If the PSNR is chosen as a criterion of comparison, 
these results are close to those obtained by Woods and 
O'Neil [42] and Westerink et at. [40]. However, in their 
subband coding algorithm, they use 32-taps Johnston fil- 
ters, while only 9 or 7 taps are necessary for our method. 
According to Westerink's results in [41], the PSNR de- 
creases by about 2 dB when using 8-taps Johnston filters. 
However, some others new QMF designs can also lead to 
good results with about 9 taps for image coding 

In this section, we present both numerical and qualita- 
tive comparison between our coding scheme and other 
previously published results. Since the most popular im- 
age in the recent literature has been the 512 by 512 Lena 
image, the comparison is made using this image taken 
outside the training set. 

Among the different methods published, we consider 
the three following well-known methods: Ho and Gersho 
obtained a 30.93-dB PSNR at 0.36 bpp, result using 
"variable-rate multi stage VQ" [23]. Riskin and Gray 
improved on the full search VQ (PSNR = 29.29 dB, 0.32 
bpp) using pruned tree structured VQ (PSNR = 30.92 dB, 
0.32 bpp) [34]. High PSNR values were obtained by 
Woods and Cohen using entropy coded and predictive VQ 
(PSNR = 32.5 dB, 0.45 bpp) [II]. 

Our aim is not to optimize the PSNR but rather a 
weighted function of the MSE in order to match human 
vision. We give two examples at low bit rate using 
wavelet VQ. 

Our initial result at 0.37 bpp presented Fig. 18 with a 
30.85-dB PSNR is very close to those of Ho and Gersho 
[23] and Riskin et al [34]. The perceptual quality of our 
coded images is better than indicated by the PSNR value 
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Fig. 18. 512 by 512 Lena image. Filters no. 2 9-7, PSNR = 30.85 dB. 
<R 7 = 0.37 bpp. 




Fig. 19. 512 by 512 Lena image. Filters no, 2 9-7, PSNR = 29.1l'dB, 
■&.' T = 0.21 bpp. ./ 



.mainly due to the regularity of the wavelet and the bit 
allocation. These images do not suffer from the blocking 
effects obtained when using VQ in the spajial domain. No 
ringing effects can be observed. 

The second result at 0.21 bpp presented in Fig. 19 with 
a 29. 11-dB PSNR shows that a very low bit rate can be 
achieved with our method; without severe degradation. 

Gur method using a new class of filters derived from 
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wavelet theory using full search VQ can be improved by 
any of the three above-mentioned methods. 
In fact the LBG clustering algorithm is a very simple 
v algorithm but not optimal for variable length code. The 
PSNR of the method could be improved by about 3 dB,' 
for example, using ECVQ [34] but CPU time becomes 
prohibitively expensive. " 



D. Progressive Transmission Scheme^ 
. Trie main objective of progressive transmission is to 
allow the receiver to recognize a picture as quickly as pos- 
sible at minimum cbst; by sending a low resolution level 
picture first. Then, it can be decided to receiver further 
. picture details or to abort the transmission. Further details 
"of the picture are obtained by sequentially receiving the 
encoded wavelet coefficients at different resolution levels 
and directions. 

Following the example of [40], we will display each 
picture level during the progressive transmission with a 
size that matches the resolution of that particular level. 

To test the efficiency of the vector quantizer, the image 
to be coded is taken outside the training set./ 

Fig. 20 represents 5 stages in the progressive transmisr 
* sion of a 256 by 256 image using filters 9-7 given in;ex- * 
ample 2. According :to the bit allocation procedure (Sec- 
tion III-C) wiw - a generalized Gaussian PDF 
approximation law, only the wavelet coefficients, corre- 
sponding to the m = 1 and m = 2 high resolution levels 
are vector quantized, while the low level subimages 
(m > 2) are scalar quantized. -> : • 

,V. Conclusion . 
This paper describes a new image coding scheme com- 
bining the wavelet transform and VQ. 

A new family of filters has been "derived from the 
wavelet theory . We have shown the importance of regu- 
larity and vanishing moments for image coding. Further- 
- more, these filters require few taps, unlike standard QMF 
methods. ' 

- • The wavelet transform used here attempts to exploit the 
masking effect of the human eye, yielding encouraging 
results. Indeed, the proposed method enables high 
compression bit rates while maintaining good- visual qual- 
ity through the use of bit allocation iri the subimages. The 
blocking effects seen when spatial VQ is performed are 
avoided. ♦ : 

This method is well adapted to progressive transmis- 
sion as well as very low bit rate compression. Further- 
more, using a simple full-search VQ provides good re- 
sults, comparable to the best results published currently . *' 
Further research should include some new derivation 
such as entropy constraint and predictive VQ. We would' 
improve this coding scheme,. if we accept a heavier com- 
putational load. ; 
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